(57) Abstract

Apparatus and a concomitant method for estimating motion vectors having as an input a first image frame (300) and a second image
frame (302), each containing a plurality of pixels representing an image. The apparatus comprises: a pyramid processor (102) for decimating
a search area (316) within the first image frame (300) to produce a reduced resolution search area (304) and for decimating a block of
pixels in the second image frame (302) to produce a reduced resolution pixel block (308); a global search system (104) for performing a
global search within the reduced resolution search area (310) using the reduced resolution pixel block (308) until the reduced resolution 
pixel block substantially matches a matching block of pixels in the reduced resolution search area (310); and means (106) for computing 
an estimated motion vector representing a distance between a location of the reduced resolution pixel block within the second image frame 
and a location of the matching block of pixels within the first image frame. 
METHOD AND APPARATUS FOR GLOBAL-TO-LOCAL
BLOCK MOTION ESTIMATION 

The invention relates generally to a system for encoding image 
sequences and, more particularly, to apparatus and a concomitant method for
reducing the computational complexity in determining motion vectors for 
block-based motion estimation. 

An image sequence, such as a video image sequence, typically includes 
a sequence of image frames. The reproduction of video containing moving 
objects typically requires a frame speed of thirty image frames per second,
with each frame possibly containing in excess of a megabyte of information. 
Consequently, transmitting or storing such image sequences requires a large 
amount of either transmission bandwidth or storage capacity. To reduce the 
necessary transmission bandwidth or storage capacity, the frame sequence is
compressed such that redundant information within the sequence is not stored
or transmitted. As such, image sequence compression through various
encoding techniques has been the subject of a great deal of research in recent 
years. Television, video conferencing and CD-ROM archiving are applications 
which can benefit from video sequence encoding. 

Generally, to encode an image sequence, information concerning the
motion of objects in a scene from one frame to the next plays an important
role in the encoding process. Because of the high redundancy that exists 
between consecutive frames within most image sequences, substantial data 
compression can be achieved using a technique known as motion estimation.
For example, if there is no movement in a sequence, each frame in a sequence
is identical to the preceding frame in that sequence. Therefore, the redundant
frames do not have to be stored or transmitted. As such, a receiver, for 
example, can simply repeat a previously received frame to reproduce a 
sequence of identical frames without necessarily receiving each of the frames
in the sequence. This no-motion case is the simplest case in which the
redundancy between consecutive frames of a video sequence can be exploited
to predict a new frame using previous frames.

In general, however, there is at least some motion from one frame to 
the next in an image sequence. In a sequence containing motion, a current 
frame can be reconstructed using an immediately preceding frame and
information representing the difference between the current and the 
immediately preceding frame. For example, in a simple image sequence 
transmission system, at the transmitter, a current frame is compared to a 
preceding frame to determine motion information, i.e., the difference between
the two frames. Thereafter, the transmitter transmits the preceding frame 
and the motion information to a receiver. At the receiver, the current frame is 
reconstructed by combining the preceding frame with the motion information. 
Consequently, only one frame and the difference information are transmitted and
received rather than two entire frames. To further reduce the required 
bandwidth, the reference frame (e.g., the preceding frame) can be compressed 
using various subsampling techniques. In applications such as video 
conferencing, video telephone, and digital television, motion information has 
become the key to data compression. However, extraction of the motion
information from the frame sequence is itself computationally intensive, 
placing a heavy burden on the hardware designed to perform the motion 
estimation task. 
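
The difference-based transmission described above can be sketched in a few lines. The following Python fragment is only a minimal illustration, assuming frames are small lists of integer pixel rows and that the motion information is a plain per-pixel difference; the names `encode_difference` and `reconstruct` are hypothetical and do not appear in this specification.

```python
def encode_difference(preceding, current):
    """Transmitter side: per-pixel difference between two equally sized frames."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(preceding, current)]


def reconstruct(preceding, difference):
    """Receiver side: rebuild the current frame from the preceding frame."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(preceding, difference)]


# A bright pixel moves one position to the right between the two frames.
preceding = [[10, 10, 10],
             [10, 50, 10],
             [10, 10, 10]]
current = [[10, 10, 10],
           [10, 10, 50],
           [10, 10, 10]]

diff = encode_difference(preceding, current)
rebuilt = reconstruct(preceding, diff)
```

Only `preceding` and `diff` would need to be sent; most entries of `diff` are zero, which is what makes the difference cheaper to encode than a second full frame.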

Many systems determine motion information using a so-called block-based
approach. For examples of various block-based approaches, see U.S.
patents 4,924,310 issued May 8, 1990, 5,105,271 issued April 14, 1992, and
5,210,605 issued May 11, 1993. In a simple block based approach, the 
current frame is divided into a number of blocks of pixels (referred to 
hereinafter as the current blocks). For each of these current blocks, a search
is performed within a selected search area in the preceding frame for a block of
pixels that "best" matches the current block. This search is typically 
accomplished by repetitively comparing a selected current block to similarly 
sized blocks of pixels in the selected search area of the preceding frame. Once 
a block match is found, the location of the matching block in the search area in
the previous frame relative to the location of the current block within the
current frame defines a motion vector. This approach, i.e., comparing each 
current block to an entire selected search area, is known as a full search 
approach or the exhaustive search approach. The determination of motion 
vectors by the exhaustive search approach is computationally intensive. As
such, these systems tend to be relatively slow in processing the frames and
expensive to fabricate.
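
The exhaustive search described above can be sketched as follows. This is a minimal illustration rather than the method of any cited patent: it assumes frames are lists of pixel rows, uses mean absolute error as the matching criterion, and expresses the search area as a hypothetical `search_range` of offsets around the current block position. The function names are likewise illustrative.

```python
def mean_abs_error(frame, top, left, block):
    """Mean absolute pixel difference between block and frame at (top, left)."""
    h, w = len(block), len(block[0])
    total = 0
    for r in range(h):
        for c in range(w):
            total += abs(frame[top + r][left + c] - block[r][c])
    return total / (h * w)


def full_search(preceding, block, block_top, block_left, search_range):
    """Try every offset within +/- search_range; return ((dy, dx), error)."""
    h, w = len(block), len(block[0])
    rows, cols = len(preceding), len(preceding[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = block_top + dy, block_left + dx
            # Skip candidate positions that fall outside the preceding frame.
            if top < 0 or left < 0 or top + h > rows or left + w > cols:
                continue
            err = mean_abs_error(preceding, top, left, block)
            if best is None or err < best[1]:
                best = ((dy, dx), err)
    return best


# Synthetic 8x8 preceding frame in which every pixel value is distinct.
preceding = [[r * 8 + c for c in range(8)] for r in range(8)]
# The current block sits at (3, 4) in the current frame, but its pixels
# came from position (2, 3) of the preceding frame.
block = [row[3:5] for row in preceding[2:4]]
vector, error = full_search(preceding, block, 3, 4, search_range=2)
```

Here the motion vector is the offset from the current block's position to the matching position in the preceding frame, following the definition given above. The two nested offset loops are the source of the computational cost.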

Therefore, there is a need in the art for apparatus and a concomitant 
method of block motion estimation having less computational intensity than 
presently exists in the art.

The present invention overcomes the disadvantages of the prior art by
providing a global-to-local block motion estimation apparatus and 
concomitant method for determining motion vectors in a computationally 
efficient manner. Specifically, the present invention utilizes a two-step
process in which, first, a plurality of estimated motion vectors are produced,
and second, the estimated motion vectors are refined to final motion vectors. 
The first step is accomplished by filtering and decimating a sequence of image 
frames such that the apparatus produces a sequence of low resolution frames. 
Each low resolution frame is partitioned into a plurality of blocks of pixels.
Using the consecutive, low resolution frames, the apparatus performs an 
exhaustive search (hereinafter referred to as a global search) to produce 
estimated motion vectors. 

In particular, within two selected low resolution frames (a current 
frame and a preceding frame), the apparatus respectively defines a plurality
of current blocks of pixels and a plurality of preceding blocks of pixels. The 
preceding blocks are individual search areas having more pixels than the 
current blocks. The apparatus selects for processing a current block and a 
preceding block. The selected current block is then compared, using an 
exhaustive search strategy, to the selected search area until a block of pixels
within the search area is found that substantially matches the current block.
The distance between the location of the matching block in the preceding frame
and the location of the current block in the current frame defines an estimated 
motion vector. This search process repeats for each and every current block 
until the apparatus determines an estimated motion vector for each current
block. Since the search is accomplished upon reduced (low) resolution images,
the search is accomplished relatively quickly. 

Once the estimated motion vectors are computed, the apparatus then 
selects preceding and current frames in the full resolution frame sequence 
that correspond to the low resolution frames used to estimate the motion
vectors. Using these consecutive, full resolution frames and the estimated 
motion vectors, the apparatus performs a modified exhaustive search 
(hereinafter referred to as a local search) to produce final motion vectors. 
Specifically, within these full resolution frames, the apparatus partitions the 
frames into preceding blocks and current blocks of pixels. These blocks
within the full resolution frames correspond to the same blocks within the low 
resolution frames. As such, the preceding block defines a search area having 
a size that is larger than the current block. The apparatus searches the 
search area to determine a match between the current block and the search 
area. However, the estimated motion vector for a given low resolution current
block provides an initial search starting location for the full resolution current
block within the full resolution search area. As such, a match is rapidly found
without performing an exhaustive search within the full resolution search
area. 

To improve the noise immunity of the global search, the low resolution 
current blocks are formed as metablocks. These metablocks are defined by a 
group of adjacent low resolution current blocks. Typically, the metablocks are
partitioned into sub-blocks, where each sub-block contains a plurality of 
current blocks. The metablock is used in the global search in lieu of an 
individual current block. When a match is found for the metablock with pixels 
within the low resolution search area, a motion vector is computed for each 
sub-block within the metablock as well as the metablock as a whole. The
apparatus assigns to each of the current blocks comprising the metablock an 
estimated motion vector. These motion vectors are selected from either the 
motion vector of the sub-block containing the current block, the motion vector 
associated with the metablock as a whole, or some combination of the
sub-block and metablock motion vectors. Once estimated motion vectors are
assigned to the current blocks, a local search is completed as discussed above.

The teachings of the present invention can be readily understood by 
considering the following detailed description in conjunction with the 
accompanying drawings, in which: 

Fig. 1 is a block diagram of the global-to-local block motion estimation
system of the invention;

Fig. 2 depicts a flow chart illustrating the operation of the system 
shown in Fig. 1; 

Fig. 3 is a representation of a sequence of images processed by the
motion estimation system of Fig. 1;

Figs. 4 and 5 are representations of metablocks of pixels which are 
processed in an alternative embodiment of the motion estimation system of 
Fig. 1;

Fig. 6 is a representation of a sequence of images processed in an
alternative embodiment of the motion estimation system of Fig. 1; and

Fig. 7 is a representation of a systolic array for use in determining the
motion vector information of sub-blocks within the metablocks of Fig. 5 within 
an alternative embodiment of the block motion estimation system of Fig. 1. 

To facilitate understanding, identical reference numerals have been 
used, where possible, to designate identical elements that are common to the
figures. 

Fig. 1 depicts a block diagram of a preferred embodiment of the present 
invention. The present invention, a global-to-local motion estimation 
system 100, contains a pyramid processor 102, a global search system 104
and a local search system 106. Specifically, a sequence of image frames forms,
at lead 108, an input to the motion estimation system 100. This image
sequence is an input to both the pyramid processor 102 and the local search
system 106. In general, the pyramid processor 102 filters (filter 112) and
decimates (image decimator 114) each image frame as the frames arrive at 
the input, producing a sequence of reduced resolution image frames. The 
global search system 104 analyzes these reduced resolution frames to 
produce a plurality of estimated motion vectors. Lastly, the local search 
system 106 analyzes the input sequence of frames with the aid of the
estimated motion vectors to produce a plurality of final motion vectors. These 
final motion vectors can then be used by a video processing system (not 
shown) to compress the video information within the image frames. 

More specifically, within the pyramid processor 102, the filter 112 is
typically a Gaussian filter that performs weighted sum operations using
adjoining pixel values within a frame. The filtered pixel is a normalized
weighted sum of an input pixel with decreasing contributions from increasingly
distant neighboring pixels. The image decimator 114 is a conventional pixel
subsampling circuit. The output of the pyramid processor is a sequence of 
image frames wherein each frame has a lesser resolution than its
corresponding input frame. Illustratively, the decimator 114 is an eight times
decimator that reduces the number of pixels in a given square area to 1/64 of
the original number.
Hereinafter, the frames produced by the pyramid processor 102 are referred 
to as the low resolution frames. 
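
A filter-and-decimate stage of this kind might be sketched as shown below. The specification does not give filter coefficients, so the 3-tap binomial kernel, the border clamping, and the function names are all assumptions made purely for illustration; only the eight times decimation factor is taken from the text.

```python
KERNEL = (1, 2, 1)  # assumed binomial approximation to a Gaussian; weights sum to 4


def gaussian_filter(frame):
    """Separable 3x3 weighted sum, normalized by 16, with clamped borders."""
    rows, cols = len(frame), len(frame[0])

    def pixel(r, c):
        # Clamp coordinates so border pixels reuse their nearest neighbor.
        return frame[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            acc = 0
            for i, wr in enumerate(KERNEL):
                for j, wc in enumerate(KERNEL):
                    acc += wr * wc * pixel(r + i - 1, c + j - 1)
            out_row.append(acc / 16)
        out.append(out_row)
    return out


def decimate(frame, factor=8):
    """Keep every factor-th pixel in each dimension."""
    return [row[::factor] for row in frame[::factor]]


def low_resolution(frame, factor=8):
    """Filter, then subsample, as the pyramid processor is described to do."""
    return decimate(gaussian_filter(frame), factor)


flat = [[7] * 16 for _ in range(16)]
low = low_resolution(flat)
```

A 16x16 input yields a 2x2 output, so 256 pixels become 4, a reduction to 1/64. A kernel this small would not adequately suppress aliasing ahead of an eight times decimation; a practical design would use a wider filter or several cascaded stages.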

The global search system 104 compares two sequential low resolution
frames to determine estimated motion vectors for the low resolution images.
Specifically, the low resolution frames are partitioned into blocks of pixels 
such that a current frame contains a plurality of current blocks and a 
preceding frame contains a plurality of preceding blocks. The preceding blocks
contain a greater number of pixels than the current blocks, e.g., each current
block is 2 pixels by 2 pixels and each preceding block is 32 pixels by 32 pixels.
Each current block is repetitively compared to a selected preceding block until 
a match is found, e.g., 4 pixels in the current block are compared to 4 pixels 
out of 1024 pixels in the preceding block, then the 4 current block pixels are 
moved and again compared, and so on. As such, within the low resolution
search area, the system performs an exhaustive search that is hereinafter 
referred to as a global search. The blocks which produce the lowest computed 
error will be the match. Alternatively, when the computed error is less than a 
predefined error threshold, the routine deems a match found. When a match is
found, the difference in location of the current block in the current low 
resolution frame and the location of the block of pixels that matches the 
current block within the preceding block define an estimated motion vector. 
This process is repeated for each current block until a motion vector is
computed for each current block in the current low resolution frame. These 
estimated motion vectors are sent, via line 110, to the local search system. 
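
The two match-decision rules mentioned above (lowest computed error, or first error below a predefined threshold) can be contrasted in a short sketch. The function name `find_match` and the sample error values are hypothetical; the candidate list is assumed to arrive in search order as (offset, error) pairs.

```python
def find_match(candidates, threshold=None):
    """Return the (offset, error) pair chosen by the match-decision rule.

    With no threshold, every candidate is examined and the lowest error wins.
    With a threshold, the search stops at the first candidate below it.
    """
    best = None
    for offset, error in candidates:
        if threshold is not None and error < threshold:
            return offset, error
        if best is None or error < best[1]:
            best = (offset, error)
    return best


errors = [((0, 0), 9.0), ((0, 1), 2.5), ((1, 0), 0.5), ((1, 1), 4.0)]
lowest = find_match(errors)
first_ok = find_match(errors, threshold=3.0)
```

The threshold rule can terminate early and so saves comparisons, at the price of possibly accepting a match that is not the best one in the search area.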

The local search system 106 performs a modified exhaustive search 
(hereinafter referred to as a local search) using pixels within two consecutive 
full resolution frames in the input frame sequence. These two frames
correspond to the two frames previously decimated by the pyramid 
processor 102 and compared by the global search system 104. The local 
search system partitions the current and preceding full resolution frames into a
number of current blocks of pixels and a number of preceding blocks of pixels.
The current block contains fewer pixels than the preceding block, e.g., each
current block contains a 16 pixel by 16 pixel area and each preceding block
contains a 256 pixel by 256 pixel area. The local search system begins its
comparison using the estimated motion vectors supplied by the global search 
system. As such, the comparison, within the full resolution frame, begins at a
location that places the current block within a corresponding preceding block
that is very near the "best" match location. The current block typically needs
only a small amount of movement until a match is found. Thus, by using the 
estimated motion vectors, the local search is significantly faster than a 
conventional exhaustive search. The system output is a sequence of final
motion vectors representing the motion from one frame to the next in the
input frame sequence. Alternatively, the system may output the matched
block of pixels within the preceding full resolution frame such that those 
matched pixels can be further processed by an image processing system. 
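
To give a rough sense of the saving, the following back-of-the-envelope calculation counts pixel comparisons for the illustrative block sizes given above. It is a sketch under stated assumptions: candidate positions are counted as (search side - block side + 1) squared, the local search is assumed to cover offsets of plus or minus 4 pixels around the estimated position, and the cost of filtering and decimation is ignored. The helper names are hypothetical.

```python
def positions(search_side, block_side):
    """Number of placements of a square block inside a square search area."""
    return (search_side - block_side + 1) ** 2


def pixel_comparisons(search_side, block_side):
    """Pixel differences evaluated when every placement is tested."""
    return positions(search_side, block_side) * block_side ** 2


# Exhaustive search at full resolution: 16x16 block in a 256x256 area.
full_cost = pixel_comparisons(256, 16)

# Global search at low resolution: 2x2 block in a 32x32 area.
global_cost = pixel_comparisons(32, 2)

# Local search: 16x16 block tested at offsets of -4..+4 in each dimension.
local_range = 4  # assumed value
local_cost = (2 * local_range + 1) ** 2 * 16 ** 2

two_stage_cost = global_cost + local_cost
ratio = full_cost / two_stage_cost
print(full_cost, two_stage_cost, round(ratio))
```

Under these assumptions the two-stage search evaluates 24,580 pixel differences per block against 14,868,736 for the exhaustive search, a reduction of roughly 600 times.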

Fig. 2 depicts a flow chart of a routine that illustrates the operation of
the system shown in Fig. 1. Fig. 3 depicts a representation of both current
and preceding full and low resolution frames. To best understand the
method of operation of the present invention, the reader
should simultaneously consult both Figs. 2 and 3.

At step 200, the full resolution frame sequence is input into the system. 
In Fig. 3, the full resolution frame sequence is represented by frames 300 (the
preceding frame) and 302 (the current frame). At step 202 in Fig. 2, each 
frame is filtered, typically, by Gaussian filtering. Thereafter, at step 204, 
each frame is decimated to produce low resolution frames. These low 
resolution frames are depicted as frames 304 (preceding low resolution frame)
and 306 (current low resolution frame). At step 206, the routine selects a 
current low resolution frame 306 and a preceding low resolution frame 304. 
The selected frames are partitioned, at step 208, into blocks of pixels, i.e., 
defining current and preceding blocks. At step 210, the routine selects a
current block 308 and a corresponding preceding block 310. The 
corresponding preceding block defines a search area within which a match to 
the current block is sought. At step 212, the current and preceding blocks are 
compared (represented in Fig. 3 by arrow 312). The routine queries, at 
step 214, whether a match is found. If the answer to the query at step 214 is
negative, the routine proceeds along the no path to step 216. At step 216, the 
routine repositions the current block within the preceding block and returns to 
step 212 where another comparison is accomplished. The routine loops 
through step 216 until a substantial match is found at step 214. One
illustrative technique for making a match decision calculates the minimum
square error or the mean absolute error for the various pixel comparisons
performed by the global search system. The blocks which produce the lowest 
computed error will be the match. Alternatively, when the computed error is
less than a predefined error threshold, the routine deems a match found.
Thereafter, the routine determines, at step 218, an estimated motion vector
for the current block presently being processed. Once the motion vector is
computed, the routine queries, at step 220, whether all of the current blocks 
now have motion vectors associated with them. If not, the routine proceeds 
along the NO path 222 to step 210. At step 210, another current block
and associated preceding block are selected and, thereafter, processed
(globally searched) to determine an estimated motion vector. If the query at
step 220 is answered affirmatively, the routine proceeds to step 224. At this 
point, the routine has computed a set of estimated motion vectors for each 
and every current block in the current low resolution frame. 

At step 224, the routine selects a current and preceding frame 302 and
300 from the full resolution input sequence that correspond to the current and
preceding low resolution frames 306 and 304 used to determine the presently 
available set of estimated motion vectors. The selected input frames are 
partitioned, at step 226, into current and preceding blocks. Importantly, 
these current and preceding blocks correspond in position within the frame
with the current and preceding blocks derived from the low resolution frames. 
As such, any given current or preceding block in the full resolution frames has 
a low resolution equivalent in the low resolution frames. 



WO 96/08114 



8 



PCT/US95/10986 



At step 228, the routine selects from the full resolution frames a 
current block 314 and a corresponding preceding block 316. The 
corresponding preceding block defines a search area within which a match to 
the current block is sought. At step 230, an estimated motion vector is used 
(represented in Fig. 3 by arrow 318) to initially position the current block 314
within the search area defined by the preceding block 316. The estimated motion
vector is the vector that is associated with the current block 308 within the 
low resolution frame that corresponds to the current block 314 in the full 
resolution frame 302. At step 232, the current and preceding blocks are 
compared (represented in Fig. 3 by arrow 320). The routine queries, at
step 234, whether a substantial match is found. The local search system 
performs an exhaustive search using minimum square error or mean absolute 
error techniques to determine the "best" match between blocks. Since the 
initial position of the current block is estimated by the global search system,
the required search range for the local search is typically only 2 to 4 pixels. If
the answer to the query at step 234 is negative, the routine proceeds along the
no path to step 236. At step 236, the routine repositions the current 
block 314 within the preceding block 316 and returns to step 232 where 
another comparison is accomplished. The routine loops through step 236 until 
a match is found at step 234. Thereafter, the routine determines, at step 238,
a final motion vector for the current block presently being processed. Once 
the final motion vector is computed, the routine queries, at step 240, whether 
all of the current blocks now have final motion vectors associated with them. 
If not, the routine proceeds along the NO path 242 to step 228. At step 228,
another current block and associated preceding block are selected and,
thereafter, processed to determine a motion vector. If the query at step 240 is
answered affirmatively, the routine stops at step 244.
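
The local search of steps 228 through 244 can be sketched for a single current block as follows. This is an illustration under assumptions, not the claimed implementation: the low resolution estimated vector is scaled by the decimation factor to obtain the starting position (the text states only that the estimate provides the starting location), a sum of absolute differences stands in for the error measure, and the function names, frame size, and 4x4 block are hypothetical.

```python
def sad(frame, top, left, block):
    """Sum of absolute pixel differences between block and frame at (top, left)."""
    return sum(abs(frame[top + r][left + c] - block[r][c])
               for r in range(len(block))
               for c in range(len(block[0])))


def local_search(preceding, block, block_top, block_left,
                 est_vector, factor=8, search_range=4):
    """Refine a low resolution estimate; return ((dy, dx), error) at full resolution."""
    h, w = len(block), len(block[0])
    rows, cols = len(preceding), len(preceding[0])
    # Assumed mapping: one low resolution pixel spans `factor` full resolution pixels.
    start_dy, start_dx = est_vector[0] * factor, est_vector[1] * factor
    best = None
    for dy in range(start_dy - search_range, start_dy + search_range + 1):
        for dx in range(start_dx - search_range, start_dx + search_range + 1):
            top, left = block_top + dy, block_left + dx
            if top < 0 or left < 0 or top + h > rows or left + w > cols:
                continue
            err = sad(preceding, top, left, block)
            if best is None or err < best[1]:
                best = ((dy, dx), err)
    return best


SIZE = 40
preceding = [[r * SIZE + c for c in range(SIZE)] for r in range(SIZE)]
# Current block at (20, 20); its pixels came from (30, 11) in the preceding frame.
block = [row[11:15] for row in preceding[30:34]]
final_vector, error = local_search(preceding, block, 20, 20, est_vector=(1, -1))
```

The estimate (1, -1) maps to a starting offset of (8, -8), and the search over the surrounding 9x9 offsets recovers the true displacement of (10, -9) without visiting the rest of the search area.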

For simplicity, the illustrative preceding blocks 316 and 310 are
depicted in Fig. 3 as respectively containing 256 x 256 pixels and 32 x 32
pixels. However, in practice, these search areas require one extra row and one
extra column of pixels to allow for movement of the current blocks 314 and
308 within the search areas. Therefore, the dimensions of the search areas
are actually 256+1 pixels plus the length of the associated current block (16
pixels) and 32+1 pixels plus the length of the associated current block (2
pixels), respectively, in each dimension.

As previously described the low resolution current block 308 covers a 
2x2 pixel area. Thus, the block contains only four pixels. As such, 
comparisons accomplished by the global search system using a current block 
containing only four pixels may produce unsatisfactory results. Specifically,
spurious matches may be located by the global search system because the 
small number of pixels in the current block tend to correlate with noise. 

Therefore, within the global-to-local block motion estimation system, an 
alternative to using a single low resolution current block uses a number of
current blocks B0,0 through Bn,n arranged into a metablock 400 as shown in
Fig. 4. In the preferred embodiment of the motion estimation system, a
metablock 402 is formed of sixteen current blocks B0,0-B3,3. Specifically,
each current block B0,0-B3,3 within the metablock 402 includes four pixels
arranged in a 2x2 matrix corresponding to an undecimated 16x16 area within
the current full resolution frame.

Referring to Fig. 5, there is shown the metablock divided into a number 
of differing test partitions (also referred to as sub-blocks). For example, a
metablock may be divided in half vertically to form metablock 500 having two
test partitions 504 and 506 wherein each test partition includes four rows and
two columns of current blocks. A metablock may also be divided in half
horizontally to provide metablock 502 having two test partitions 508 and 510 
wherein each test partition has two rows and four columns of test blocks. 
Within the block motion estimation system, a metablock may also be divided
into any other convenient number of test partitions including, for example,
into quarters to form the test partitions 512a-d within metablock 514 or into
the overlapping test partitions 516, 518, 520, 522 within metablock 524. 

Fig. 6 shows an image processing representation 600 to demonstrate 
the use of metablocks in performing the global search. Within the image
processing representation 600, the low resolution preceding frame 304 and the
low resolution current frame 306 are the result of decimation operations
performed by the filter and decimator upon the preceding input frame 300 and 
the current input frame 302, respectively, as previously described. As also 
previously described, the low resolution preceding block 310 is disposed within
low resolution frame 304. The metablock 500 containing the low resolution
current blocks B0,0-B3,3 is disposed within low resolution frame 306. When
using metablocks, the global search system determines an estimated motion
vector for each current block B0,0-B3,3 within each metablock.

An illustrative 64 x 64 pixel full resolution metablock 602, disposed
within the current input frame 302, contains sixteen current blocks B'0,0-B'3,3,
each of which is a 16x16 pixel full resolution current block.
Effectively, each test block B'i,j within the metablock 602 is operated upon by
the decimator to provide a corresponding low resolution current block Bi,j
within the metablock 500. As such, each B'i,j corresponds to a Bi,j. The
current full resolution frame 302 is covered with a plurality of metablocks 
such as the illustrative metablock 602. The metablocks 602 within 
frame 302 do not overlap with one another. Each 64x64 pixel metablock 602 
of this plurality of metablocks corresponds to an 8x8 low resolution metablock
such as metablock 500. 

Using the global search, the metablock 500 is compared with the low 
resolution preceding block 310 (a low resolution search area) by the global 
search system for best match information. This comparison is represented 
by the arrow 312. In this manner, the global search system determines
motion vector information for the overall metablock 500. Additionally, 
estimated motion vector information is determined for each of the test blocks 
B0,0-B3,3 within the metablock 500.

For example, each test block B0,0-B3,3 is simply assigned the motion
vector information determined by the global search system for the overall
metablock 500. This greatly simplifies the operation of the block motion
estimation system. However, this method provides relatively low precision
block motion estimation and poor image quality because the large size of the
metablock may cause it to cover multiple objects moving in different
directions.

To improve the accuracy of the motion vector estimate, the system 
determines an estimated motion vector for each partition (sub-block) of the 
metablock 500. For example, the metablock 500 may be partitioned to form 
metablock 514 as shown in Fig. 5. In this case, five estimated motion vectors
are determined at each match location. Specifically, one motion vector is
determined for the overall metablock 514 and one is determined for each of the
four test partitions 512a-d. If a metablock is partitioned to form the 
metablock 524, one estimated motion vector is determined for the overall 
metablock 524 and one is determined for each of the four overlapping test
partitions 516, 518, 520, and 522. In each of these metablocks, the estimated
motion vector for each partition is assigned to its constituent current blocks.
If, however, multiple test partitions overlap a particular current block, then 
either one of the motion vectors is selected for that current block or the 
motion vectors are, in some manner, combined into a single vector to 
represent that current block. Such combination can be accomplished by
averaging, weighted averaging, and the like. 
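
The assignment of partition vectors to current blocks, including simple averaging where partitions overlap, might look like the sketch below. The partition layout is hypothetical (two partitions overlapping in the middle columns of a 4x4 metablock) and is not the geometry of Fig. 5; the function name and the sample vectors are also illustrative.

```python
def assign_vectors(partitions, grid=4):
    """Map each block (row, col) to the average vector of the partitions covering it.

    partitions: list of (set_of_block_coordinates, (dy, dx)) pairs.
    """
    assigned = {}
    for row in range(grid):
        for col in range(grid):
            covering = [vec for blocks, vec in partitions if (row, col) in blocks]
            if not covering:
                continue
            n = len(covering)
            assigned[(row, col)] = (sum(v[0] for v in covering) / n,
                                    sum(v[1] for v in covering) / n)
    return assigned


# Hypothetical overlapping partitions: columns 0-2 and columns 1-3.
left = {(r, c) for r in range(4) for c in (0, 1, 2)}
right = {(r, c) for r in range(4) for c in (1, 2, 3)}
vectors = assign_vectors([(left, (2, 0)), (right, (4, 2))])
```

Blocks covered by a single partition inherit its vector unchanged, while blocks in the overlap receive the plain average. A weighted average could be substituted by replacing the two `sum(...) / n` expressions.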

In general, image quality can be increased by partitioning a metablock, 
determining the estimated motion vector information for each partition, and 
assigning the motion vector information for each partition to the blocks
B0,0-Bn,n within the partition. Furthermore, image quality may be further
increased by dividing the metablock into a larger number of partitions. 
However, if a metablock is partitioned into a plurality of 2x2 pixel blocks, the 
results are poor as previously described with respect to the 2x2 pixel block.
Therefore, a reasonable partitioning tradeoff is made considering such factors
as noise immunity, spurious matches, the amount of hardware required,
and computational intensity.

As described previously, the use of partitioned metablocks in searching 
the low resolution frame produces multiple estimated motion vectors, i.e., one
vector for the entire metablock and one for each partitioned area. These 
vectors are determined using techniques such as minimum square error to 
compute a best match for the metablock and its constituent components. 
These estimated vectors are then applied to the local search system as
illustrated by pathway 318.

Thus, assuming a two partition metablock, e.g., metablock 500, the 
local search system must analyze three candidate locations wherein a 
minimum square error match was found by the global search system. These 
three locations correspond to the overall low resolution metablock and to each
of the two partitions therein. One of the three possibilities must therefore be
selected for each local test block B0,0-B3,3 in order to perform the search of
the local search system.

One of the three possibilities can be eliminated immediately for each
low resolution current block B0,0-B3,3 because each block B0,0-B3,3 is
outside of one of the test partitions 504 and 506. For example, when selecting
a starting location for the test block B0,0, the candidate search location
corresponding to the estimated motion vector for the test partition 506 can be
ignored because the test block B0,0 is not located inside the test partition 506.
Only the candidate search locations corresponding to the overall metablock
500 and the test partition 504 must be considered.
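
The elimination step can be expressed as a simple membership test. In the sketch below, test partition 504 is assumed to be the half of the metablock containing columns 0 and 1 (consistent with block B0,0 lying inside it) and test partition 506 the half containing columns 2 and 3; the function name and the sample vectors are hypothetical.

```python
# Assumed layout of the vertically split metablock 500: (row, col) block indices.
PARTITIONS = {
    504: {(r, c) for r in range(4) for c in (0, 1)},
    506: {(r, c) for r in range(4) for c in (2, 3)},
}


def candidate_vectors(block, metablock_vector, partition_vectors):
    """Return the estimated vectors worth considering for one current block.

    The overall metablock vector is always a candidate; a partition vector is
    kept only if the block actually lies inside that partition.
    """
    candidates = [metablock_vector]
    for partition_id, blocks in PARTITIONS.items():
        if block in blocks:
            candidates.append(partition_vectors[partition_id])
    return candidates


per_partition = {504: (1, 0), 506: (3, 2)}
for_b00 = candidate_vectors((0, 0), (1, 1), per_partition)
for_b03 = candidate_vectors((0, 3), (1, 1), per_partition)
```

For block B0,0 only the metablock vector and the vector of partition 504 remain, leaving two candidates rather than three.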

In this manner, a motion vector is selected for each block Bq q -B3 3' 

based upon either the overall metablock 500 or one of the test partitions 504, 
506 rather than based upon a single 2x2 pixel block Bq q-B 3 3. Because the 

metablock 500 and the test partitions 504, 506 are much larger than the 2x2 
3 5 blocks BQ Q-B3 3, the global-to-local block motion estimation system has 

greater resistance to noise because 2x2 test blocks are more likely to match 
randomly with small bit patterns within the low resolution search area. 
The selection of a final motion vector is accomplished by determining
the best match of the full resolution block Bi,j for both of the two possible
starting locations as indicated by the arrows 604 and 606. This involves
selecting either the estimated motion vector corresponding to the overall
metablock 500 or that of the appropriate test partition 504 or 506. When one of
the two possible starting locations within search area 316 is selected, the local
search system performs its search as previously described.
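
One possible way of choosing between the two starting locations is sketched below. The sketch assumes that the candidate vectors have already been scaled to full resolution and that a sum of squared errors serves as the match measure; `block_error` and `select_start` are hypothetical names.

```python
def block_error(area, block, top, left):
    """Sum of squared errors with the block placed at (top, left)."""
    return sum(
        (area[top + r][left + c] - block[r][c]) ** 2
        for r in range(len(block))
        for c in range(len(block[0]))
    )


def select_start(area, block, origin, candidates):
    """Return the candidate (dy, dx) whose starting location matches best.

    origin is the block's (row, col) in search-area coordinates, and each
    candidate is assumed to keep the block inside the search area.
    """
    oy, ox = origin
    return min(
        candidates,
        key=lambda v: block_error(area, block, oy + v[0], ox + v[1]),
    )
```

Only one block comparison is made per candidate, after which a single local search proceeds from the selected location.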

In an alternate embodiment of the motion estimation system, a full search
can be performed for each of the two possible candidate search starting
positions to provide two local search results. The better of the two local
search results may be used to produce a motion vector. It is believed that this
method may be less efficient than selecting a single estimated motion vector
and performing a single search according to the selected motion vector.
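
The alternate embodiment may be sketched as follows, again by way of example only. The sketch assumes an exhaustive local search within a small radius of each starting position and a sum of squared errors as the match measure; the names `block_error`, `local_search` and `best_of_starts` and the `radius` parameter are hypothetical.

```python
def block_error(area, block, top, left):
    """Sum of squared errors with the block placed at (top, left)."""
    return sum(
        (area[top + r][left + c] - block[r][c]) ** 2
        for r in range(len(block))
        for c in range(len(block[0]))
    )


def local_search(area, block, start, radius=1):
    """Test every placement within +/-radius of start.

    Returns (error, (top, left)) for the best placement found.
    """
    max_top = len(area) - len(block)
    max_left = len(area[0]) - len(block[0])
    best = (float("inf"), None)
    for top in range(start[0] - radius, start[0] + radius + 1):
        for left in range(start[1] - radius, start[1] + radius + 1):
            # Skip placements that fall outside the search area.
            if 0 <= top <= max_top and 0 <= left <= max_left:
                err = block_error(area, block, top, left)
                if err < best[0]:
                    best = (err, (top, left))
    return best


def best_of_starts(area, block, starts, radius=1):
    """Run one local search per starting position and keep the better."""
    results = [local_search(area, block, s, radius) for s in starts]
    return min(results, key=lambda result: result[0])
```

Two complete local searches are performed here, which is consistent with the belief stated above that this approach may be less efficient than a single search from a selected starting position.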

Referring to Fig. 7, there is shown an 8x8 systolic array 700. The
systolic array 700 is a conventional systolic array which may be used within
the global-to-local block motion estimation system to simultaneously
determine the motion vector information of a metablock and each of the
various test partitions within the metablock. Although the systolic array 700
provides a convenient way to make these determinations, any known method
for determining motion vector information may be used to determine the
motion vector information of the various possible partitions which may be
formed by dividing a metablock.

The systolic array 700 is a conventional systolic array wherein each
node Ni,j receives its node input either from an array input or a node output,
performs its node arithmetic operations, and applies its node output either to
the array output or to the input of another node. Therefore, when the
differences determined by the comparisons of the global search system are
applied to the systolic array, the estimated motion vector information may be
calculated. In particular, the estimated motion vector information for various
test partitions within the low resolution metablock 500 may be readily
determined. The operations of the systolic array may be performed by
program instructions executed by a computer or by separate hardware
arithmetic circuits.

Within the metablock 500, the estimated motion vector information for
the test partition 504 is obtained by summing the outputs of the nodes N4,4
and N8,4 of the systolic array. The estimated motion vector information for
the test partition 506 is obtained by summing the outputs of the nodes N4,8
and N8,8. The estimated motion vector information for the overall metablock
500 is obtained by summing outputs of the motion vector information nodes
N4,4, N4,8, N8,4 and N8,8.

Similarly, for the test partitions 508 and 510 of the metablock 502, the
estimated motion vector information nodes N4,4+N4,8 and N8,4+N8,8 of the
systolic array are summed. Within the test metablock 514, the estimated
motion vector information node N4,4 corresponds to the test partition 512a,
the estimated motion vector information node N4,8 corresponds to the test
partition 512b, the estimated motion vector information node N8,4
corresponds to the test partition 512c, and the estimated motion vector
information node N8,8 corresponds to the test partition 512d.
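
The node summations described above may be sketched as follows. The sketch assumes that each of the four nodes N4,4, N4,8, N8,4 and N8,8 outputs a partial error for one quadrant of the metablock, and that the test partitions 508 and 510 pair the nodes as N4,4+N4,8 and N8,4+N8,8 respectively; the function name `partition_errors` and its parameter names are hypothetical.

```python
def partition_errors(n44, n48, n84, n88):
    """Combine four node outputs into per-partition error values.

    The arguments are the outputs of nodes N4,4, N4,8, N8,4 and N8,8.
    Keys are the reference numerals of the metablock and test partitions.
    """
    return {
        "500": n44 + n48 + n84 + n88,  # overall metablock 500
        "504": n44 + n84,
        "506": n48 + n88,
        "508": n44 + n48,  # assumed pairing for partition 508
        "510": n84 + n88,  # assumed pairing for partition 510
        "512a": n44,
        "512b": n48,
        "512c": n84,
        "512d": n88,
    }
```

Because every partition value is a sum of the same four node outputs, the motion vector information for all of the partitionings can be derived from a single pass through the array.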

It will be appreciated by those skilled in the art that changes could be
made to the embodiments described above without departing from the broad
inventive concepts thereof. It is understood, therefore, that this invention is
not limited to the particular embodiments disclosed, but it is intended to cover
modifications within the spirit and scope of the invention as defined by the
appended claims.
I CLAIM:

1. A method for estimating motion vectors comprising the steps of:

providing a first image frame and a second image frame, each
containing a plurality of pixels representing an image;

decimating a search area within said first image frame to produce a
reduced resolution search area;

decimating a block of pixels in said second image frame to produce a
metablock containing a plurality of reduced resolution blocks of pixels
partitioned into pre-defined sub-blocks;

performing a global search within said reduced resolution search area
using said metablock until said metablock substantially matches a matching
block of pixels in said reduced resolution search area; and

computing a plurality of estimated motion vectors representing a
distance between a location of each of said reduced resolution pixel blocks
within said metablock within said second image frame and a location of said
matching block of pixels within said first image frame.

2. The method of claim 1 further comprising the steps of:

selecting an estimated motion vector from said plurality of estimated
motion vectors;

positioning, at an initial position defined by said selected estimated
motion vector, a full resolution pixel block, from said second image frame,
within a full resolution search area within said first image frame;

performing, from the initial position, a local search within said full
resolution search area using said full resolution pixel block until said full
resolution pixel block substantially matches a matching block of pixels in said
full resolution search area; and

computing a final motion vector representing a distance between a
location of said full resolution pixel block within said second image frame and a
location of said matching block of pixels within said first image frame.

3. The method of claim 1 wherein said global search performing step
further comprises the steps of:

using said metablock until said metablock substantially matches a
matching block of pixels in said reduced resolution search area;

computing an estimated motion vector for each of said sub-blocks and
for said metablock; and

assigning an estimated motion vector to each of said reduced resolution
blocks of pixels forming said metablock.

4. The method of claim 3 wherein said assigning step further comprises
the steps of:

selecting a reduced resolution block of pixels from said plurality of
reduced resolution blocks of pixels; and

assigning an estimated motion vector to said selected reduced
resolution block of pixels, where said assigned estimated motion vector is
selected from either the estimated motion vector computed for said metablock
or one of the estimated motion vectors computed for said sub-blocks.

5. The method of claim 3 wherein said assigning step further comprises
the steps of:

selecting a reduced resolution block of pixels from said plurality of
reduced resolution blocks of pixels; and

assigning an estimated motion vector to said selected reduced
resolution block of pixels, where said assigned estimated motion vector is a
combination of the estimated motion vector computed for said metablock and
one or more of the estimated motion vectors computed for said sub-blocks.

6. Apparatus for estimating motion vectors having as an input a first 
image frame and a second image frame, each containing a plurality of pixels 
representing an image, said apparatus comprising: 

a pyramid processor for decimating a search area within said first 
image frame to produce a reduced resolution search area and for decimating a 
block of pixels in said second image frame to produce a metablock containing a 
plurality of reduced resolution blocks of pixels partitioned into pre-defined
sub-blocks;

global search means, connected to said pyramid processor, for 
performing a global search within said reduced resolution search area using 
said metablock until said metablock substantially matches a matching block 
of pixels in said reduced resolution search area; and 

means, connected to said global search means, for computing a 
plurality of estimated motion vectors representing a distance between a 
location of each of said plurality of reduced resolution blocks within said 
metablock within said second image frame and a location of said matching 
block of pixels within said first image frame. 

7. The apparatus of claim 6 further comprising:

means for selecting an estimated motion vector from said plurality of
estimated motion vectors;

means for positioning, at an initial position defined by said selected
estimated motion vector, a full resolution pixel block, from said second image
frame, within a full resolution search area within said first image frame;

local search means for performing, from the initial position, a local
search within said full resolution search area using said full resolution pixel
block until said full resolution pixel block substantially matches a matching
block of pixels in said full resolution search area; and

means for computing a final motion vector representing a distance
between a location of said full resolution pixel block within said second image
frame and a location of said matching block of pixels within said first image
frame.

8. The apparatus of claim 7 wherein said global search means further
comprises:

means for using said metablock until said metablock substantially
matches a matching block of pixels in said reduced resolution search area;

means for computing an estimated motion vector for each of said
sub-blocks and for said metablock; and

means for assigning an estimated motion vector to each of said reduced
resolution blocks of pixels forming said metablock.

9. The apparatus of claim 8 wherein said assigning means further
comprises:

means for selecting a reduced resolution block of pixels from said
plurality of reduced resolution blocks of pixels; and

means for assigning an estimated motion vector to said selected
reduced resolution block of pixels, where said assigned estimated motion
vector is selected from either the estimated motion vector computed for said
metablock or one of the estimated motion vectors computed for said
sub-blocks.

10. The apparatus of claim 8 wherein said assigning means further
comprises:

means for selecting a reduced resolution block of pixels from said
plurality of reduced resolution blocks of pixels; and

means for assigning an estimated motion vector to said selected
reduced resolution block of pixels, where said assigned estimated motion
vector is a combination of the estimated motion vector computed for said
metablock and one or more of the estimated motion vectors computed for said
sub-blocks.

FIG. 2

FIG. 2A

FIG. 2B

FIG. 3